KNN-CF Approach: Incorporating Certainty Factor to kNN Classification

نویسنده

  • Shizhao Zhang
چکیده

 Abstract—KNN classification finds k nearest neighbors of a query in training data and then predicts the class of the query as the most frequent one occurring in the neighbors. This is a typical method based on the majority rule. Although majority-rule based methods have widely and successfully been used in real applications, they can be unsuitable to the learning setting of skewed class distribution. This paper incorporates certainty factor (CF) measure to kNN classification, called kNN-CF classification, so as to deal with the above issue. This CF-measure based strategy can be applied on the top of a kNN classification algorithm (or a hot-deck method) to meet the need of imbalanced learning. This leads to that an existing kNN classification algorithm can easily be extended to the setting of skewed class distribution. Some experiments are conducted for evaluating the efficiency, and demonstrate that the kNN-CF classification outperforms standard kNN classification in accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection

K nearest neighbor algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affects the classification accuracy of the algorithm. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...

متن کامل

Improving K-nearest-neighborhood based Collaborative Filtering via Similarity Support

Collaborative Filtering (CF) is the most popular choice when implementing personalized recommender systems. A classical approach to CF is based on K-nearest-neighborhood (KNN) model, where the precondition for making recommendations is the KNN construction for involved entities. However, when building KNN sets, there exits the dilemma to decide the value of K --a small value will lead to poor r...

متن کامل

Evolutionary fuzzy k-nearest neighbors algorithm using interval-valued fuzzy sets

One of the most known and effective methods in supervised classification is the K-Nearest Neighbors classifier. Several approaches have been proposed to enhance its precision, with the Fuzzy K-Nearest Neighbors (Fuzzy-kNN) classifier being among the most successful ones. However, despite its good behavior, Fuzzy-kNN lacks of a method for properly defining several mechanisms regarding the repres...

متن کامل

PARALLEL kNN ON GPU ARCHITECTURE USING OpenCL

In data mining applications, one of the useful algorithms for classification is the kNN algorithm. The kNN search has a wide usage in many research and industrial domains like 3-dimensional object rendering, content-based image retrieval, statistics, biology (gene classification), etc. In spite of some improvements in the last decades, the computation time required by the kNN search remains the...

متن کامل

Breast Cancer Diagnosis from Perspective of Class Imbalance

Introduction: Breast cancer is the second cause of mortality among women. Early detection is the only rescue to reduce the risk of breast cancer mortality. Traditional methods cannot effectively diagnose tumor since they are based on the assumption of well-balanced dataset.. However, a hybrid method can help to alleviate the two-class imbalance problem existing in the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEEE Intelligent Informatics Bulletin

دوره 11  شماره 

صفحات  -

تاریخ انتشار 2010